PROTEASES FROM GRAM-POSITIVE ORGANISMS



FIELD OF THE INVENTION 

The present invention relates to cysteine proteases derived from gram- 
positive microorganisms. The present invention provides nucleic acid and amino acid 
sequences of cysteine protease 1, 2 and 3 identified in Bacillus. The present invention also 
provides methods for the production of cysteine protease 1, 2 and 3 in host cells as well as 
the production of heterologous proteins in a host cell having a mutation or deletion of part or 
all of at least one of the cysteine proteases of the present invention. 

BACKGROUND OF THE INVENTION 

Gram-positive microorganisms, such as members of the group Bacillus, have been
used for large-scale industrial fermentation due, in part, to their ability to secrete their
fermentation products into the culture media. In gram-positive bacteria, secreted proteins
are exported across a cell membrane and a cell wall, and then are subsequently released
into the external media, usually maintaining their native conformation.

Various gram-positive microorganisms are known to secrete extracellular and/or 
intracellular protease at some stage in their life cycles. Many proteases are produced in
large quantities for industrial purposes. A negative aspect of the presence of proteases in 
gram-positive organisms is their contribution to the overall degradation of secreted 
heterologous or foreign proteins. 

The classification of proteases found in microorganisms is based on their catalytic
mechanism, which results in four groups: the serine proteases; metalloproteases; cysteine
proteases; and aspartic proteases. These categories can be distinguished by their
sensitivity to various inhibitors. For example, the serine proteases are inhibited by
phenylmethylsulfonylfluoride (PMSF) and diisopropylfluorophosphate (DIFP); the
metalloproteases by chelating agents; the cysteine enzymes by iodoacetamide and heavy
metals; and the aspartic proteases by pepstatin. The serine proteases have alkaline pH
optima, the metalloproteases are optimally active around neutrality, and the cysteine and
aspartic enzymes have acidic pH optima (Biotechnology Handbooks, Bacillus, vol. 2, edited
by Harwood, 1989, Plenum Press, New York).

The activity of cysteine protease depends on a catalytic dyad of cysteine and 
histidine with the order differing among families. The best known family of cysteine 
proteases is that of papain having catalytic residues Cys-25 and His-159. Cysteine 
proteases of the papain family catalyze the hydrolysis of peptide, amide, ester, thiol ester 
and thiono ester bonds. Naturally occurring inhibitors of cysteine proteases of the papain 
family are those of the cystatin family (Methods in Enzymology, vol. 244, Academic Press,
Inc. 1994).

SUMMARY OF THE INVENTION

The present invention relates to the unexpected and surprising discovery of three
heretofore unknown or unrecognized cysteine proteases found in Bacillus subtilis,
designated herein as CP1, CP2 and CP3, having the nucleic acid and amino acid sequences
shown in Figures 1A-1B, Figures 5A-5B and 6A-6B, respectively. The present invention is
based, in part, upon the presence of the characteristic cysteine protease amino acid motif
GXCWAF found in uncharacterized translated genomic nucleic acid sequences of Bacillus
subtilis. The present invention is also based in part upon the structural relatedness that
CP1 has with the cysteine protease papain, specifically with respect to the location of the
catalytic histidine/alanine and asparagine/serine residues, and the structural relatedness
that CP1 has with CP2 and CP3.

The present invention provides isolated polynucleotide and amino acid sequences
for CP1, CP2 and CP3. Due to the degeneracy of the genetic code, the present invention
encompasses any nucleic acid sequence that encodes the CP1, CP2 and CP3 amino acid
sequence shown in the Figures.

The present invention encompasses amino acid variations of B. subtilis CP1, CP2
and CP3 amino acids disclosed herein that have proteolytic activity. B. subtilis CP1, CP2
and CP3, as well as proteolytically active amino acid variations thereof, have application in
cleaning compositions. In one aspect of the present invention, CP1, CP2 or CP3 obtainable
from a gram-positive microorganism is produced on an industrial fermentation scale in a
microbial host expression system. In another aspect, isolated and purified recombinant
CP1, CP2 or CP3 obtainable from a gram-positive microorganism is used in compositions of
matter intended for cleaning purposes, such as detergents. Accordingly, the present
invention provides a cleaning composition comprising at least one of CP1, CP2 and CP3
obtainable from a gram-positive microorganism. The cysteine protease may be used alone
in the cleaning composition or in combination with other enzymes and/or mediators or
enhancers.

The production of desired heterologous proteins or polypeptides in gram-positive
microorganisms may be hindered by the presence of one or more proteases which degrade
the produced heterologous protein or polypeptide. Therefore, the present invention also
encompasses gram-positive microorganisms having a mutation or deletion of part or all of the
gene encoding CP1 and/or CP2 and/or CP3, which results in the inactivation of the CP1
and/or CP2 and/or CP3 proteolytic activity, either alone or in combination with deletions or
mutations in other proteases, such as apr, npr, epr and mpr, for example, or other proteases
known to those of skill in the art. In one embodiment of the present invention, the
gram-positive organism is a member of the genus Bacillus. In another embodiment, the
Bacillus is Bacillus subtilis.

In another aspect, the gram-positive microorganism host having one or more
deletions or mutations in a cysteine protease of the present invention is further genetically
engineered to produce a desired protein. In one embodiment of the present invention, the
desired protein is heterologous to the gram-positive host cell. In another embodiment, the 
desired protein is homologous to the host cell. The present invention encompasses a gram- 
positive host cell having a deletion or interruption of the naturally occurring nucleic acid 
encoding the homologous protein, such as a protease, and having nucleic acid encoding 
the homologous protein or a variant thereof re-introduced in a recombinant form. In another 
embodiment, the host cell produces the homologous protein. Accordingly, the present 
invention also provides methods and expression systems for reducing degradation of 
heterologous or homologous proteins produced in gram-positive microorganisms comprising 
the steps of obtaining a Bacillus host cell comprising nucleic acid encoding said 
heterologous protein wherein said host cell contains a mutation or deletion in at least one of 
the genes encoding cysteine protease 1, cysteine protease 2 and cysteine protease 3; and 
growing said Bacillus host cell under conditions suitable for the expression of said 
heterologous protein. The gram-positive microorganism may be normally sporulating or 
non-sporulating. 

The present invention provides methods for detecting gram-positive microorganism
homologs of B. subtilis CP1, CP2 and CP3 that comprise hybridizing part or all of the
nucleic acid encoding B. subtilis CP1, CP2 and CP3 with nucleic acid derived from
gram-positive organisms, either of genomic or cDNA origin.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A-1B shows the DNA (SEQ ID NO:1) and amino acid sequence for CP1
(YJDE) (SEQ ID NO:2).

Figure 2 shows an amino acid alignment of papain (SEQ ID NO:3) (accession
number papa_carpa.p) with the cysteine protease CP1, designated YJDE. For Figures 2, 3
and 4, the motif GXCWAF has been marked along with the catalytic cysteine and the 
conserved catalytic histidine/alanine and asparagine/serine residues. 

Figure 3 shows the amino acid alignment of CP1 (YJDE) (SEQ ID NO:2) with CP3 (PMI)
(SEQ ID NO:5).

Figure 4 shows the amino acid alignment of CP1 (YJDE) (SEQ ID NO:2) with CP2
(YdhS).

Figure 5A-5B shows the amino acid (SEQ ID NO:6) and nucleic acid sequence for 
CP2 (YdhS) (SEQ ID NO:7). 

Figure 6A-6B shows the amino acid (SEQ ID NO:4) and nucleic acid sequence for
CP3 (PMI) (SEQ ID NO:5).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Definitions 

As used herein, the genus Bacillus includes all members known to those of skill in
the art, including but not limited to B. subtilis, B. licheniformis, B. lentus, B. brevis, B.
stearothermophilus, B. alkalophilus, B. amyloliquefaciens, B. coagulans, B. circulans, B.
lautus and B. thuringiensis.

The present invention relates to novel CP1, CP2 and CP3 from gram-positive
organisms. In a preferred embodiment, the gram-positive organism is a Bacillus. In
another preferred embodiment, the gram-positive organism is Bacillus subtilis. As used
herein, "B. subtilis CP1, CP2 or CP3" refers to the amino acid sequences shown in the Figures.
Figures 1A-1B show the amino acid and nucleic acid sequence for CP1 (YJDE); Figures 5A-5B
show the amino acid and nucleic acid sequence for CP2 (YdhS); and Figures 6A-6B
show the amino acid and nucleic acid sequences for CP3 (PMI). The present invention
encompasses amino acid variants of the amino acid sequences disclosed in Figures 1A-1B
and 5A-5B and 6A-6B that have proteolytic activity. Such proteolytic amino acid variants
can be used in cleaning compositions.

As used herein "nucleic acid" refers to a nucleotide or polynucleotide sequence, and
fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may
be double-stranded or single-stranded, whether representing the sense or antisense strand.
As used herein "amino acid" refers to peptide or protein sequences or portions thereof. A
"polynucleotide homolog" as used herein refers to a gram-positive microorganism
polynucleotide that has at least 80%, at least 90% or at least 95% identity to B. subtilis
CP1, CP2 or CP3 or which is capable of hybridizing to B. subtilis CP1, CP2 or CP3 under
conditions of high stringency and which encodes an amino acid sequence having cysteine
protease activity.

The terms "isolated" or "purified" as used herein refer to a nucleic acid or amino acid
that is removed from at least one component with which it is naturally associated.

As used herein the term "heterologous protein" refers to a protein or polypeptide
that does not naturally occur in a gram-positive host cell. Examples of heterologous
proteins include enzymes such as hydrolases including proteases, cellulases, amylases,
carbohydrases, and lipases; isomerases such as racemases, epimerases, tautomerases, or
mutases; transferases, kinases and phosphatases. The heterologous gene may encode
therapeutically significant proteins or peptides, such as growth factors, cytokines, ligands,
receptors and inhibitors, as well as vaccines and antibodies. The gene may encode
commercially important industrial proteins or peptides, such as proteases, carbohydrases
such as amylases and glucoamylases, cellulases, oxidases and lipases. The gene of
interest may be a naturally occurring gene, a mutated gene or a synthetic gene.

The term "homologous protein" refers to a protein or polypeptide native or naturally
occurring in a gram-positive host cell. The invention includes host cells producing the
homologous protein via recombinant DNA technology. The present invention encompasses
a gram-positive host cell having a deletion or interruption of naturally occurring nucleic acid
encoding the homologous protein, such as a protease, and having nucleic acid encoding
the homologous protein, or a variant thereof, re-introduced in a recombinant form. In
another embodiment, the host cell produces the homologous protein.

As used herein, the term "overexpressing" when referring to the production of a
protein in a host cell means that the protein is produced in greater amounts than its 
production in its naturally occurring environment. 

As used herein, the phrase "proteolytic activity" refers to a protein that is able to
hydrolyze a peptide bond. Enzymes having proteolytic activity are described in Enzyme
Nomenclature, 1992, edited by Webb, Academic Press, Inc.

Detailed Description of the Preferred Embodiments

The unexpected discovery of the cysteine proteases CP1, CP2 and CP3 in B. subtilis
provides a basis for producing host cells, expression methods and systems which can be
used to prevent the degradation of recombinantly produced heterologous proteins. In a
preferred embodiment, the host cell is a gram-positive host cell that has a deletion or
mutation in the naturally occurring cysteine protease, said mutation resulting in deletion or
inactivation of the production by the host cell of the proteolytic cysteine protease gene
product. The host cell may additionally be genetically engineered to produce a desired
protein or polypeptide.

It may also be desired to genetically engineer host cells of any type to produce a
gram-positive cysteine protease. Such host cells are used in large scale fermentation to
produce large quantities of the cysteine protease which may be isolated or purified and
used in cleaning products, such as detergents.

I. Cysteine Protease Sequences 

The CP1, CP2 and CP3 polynucleotides having the sequences as shown in Figures
1A-1B, 5A-5B and 6A-6B, respectively, encode the Bacillus subtilis cysteine proteases CP1,
CP2 and CP3. As will be understood by the skilled artisan, due to the degeneracy of the
genetic code, a variety of polynucleotides can encode the Bacillus subtilis CP1, CP2 and
CP3. The present invention encompasses all such polynucleotides.
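
The scale of this degeneracy can be illustrated with a short calculation. The following is a
minimal sketch, not part of the disclosed method: it assumes the standard genetic code
(whose amino acid codon assignments are, to our understanding, shared by the bacterial
translation table), and the function name and the example peptide GSCWAF (one
hypothetical instance of the GXCWAF motif) are illustrative only.

```python
# Number of sense codons per amino acid in the standard genetic code
# (assumption: the standard table applies; the 61 sense codons sum below).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide: str) -> int:
    """Return how many distinct coding sequences encode the given peptide.

    The count is the product of the codon multiplicities of each residue;
    stop codons are not included.
    """
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]  # KeyError for non-standard residues
    return total


if __name__ == "__main__":
    # Hypothetical hexapeptide matching GXCWAF with X = serine.
    print(count_encoding_sequences("GSCWAF"))
```

Even a six-residue stretch is encoded by hundreds of distinct polynucleotides, so the number
of polynucleotides encoding a full-length protease is very large.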

The present invention encompasses CP1, CP2 and CP3 polynucleotide homologs
encoding gram-positive microorganism cysteine proteases CP1, CP2 and CP3, respectively,
which have at least 80%, or at least 90% or at least 95% identity to B. subtilis CP1, CP2 and
CP3 as long as the homolog encodes a protein that has proteolytic activity.
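
As an illustration of how such identity thresholds might be checked, the sketch below
computes percent identity over two sequences that have already been aligned. It is a
hedged example only: the specification does not define how identity is calculated, so the
choice to ignore gapped columns, the gap character and the function names are
assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap are skipped, and the
    denominator is the number of remaining columns. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(identity: float, threshold: float = 80.0) -> bool:
    """True if the identity reaches the chosen threshold (80, 90 or 95)."""
    return identity >= threshold


if __name__ == "__main__":
    # Two short, made-up aligned fragments differing at one position.
    identity = percent_identity("ATGGCATGCA", "ATGGCTTGCA")
    print(identity, meets_threshold(identity, 90.0))
```

A candidate homolog would, in addition, have to encode a protein with proteolytic activity,
which a sequence comparison alone cannot establish.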

Gram-positive polynucleotide homologs of B. subtilis CP1, CP2 or CP3 may be obtained
by standard procedures known in the art from, for example, cloned DNA (e.g., a DNA "library"),
genomic DNA libraries, by chemical synthesis once identified, by cDNA cloning, or by the
cloning of genomic DNA, or fragments thereof, purified from a desired cell. (See, for example,
Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual, 2d Ed., Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York; Glover, D. M. (ed.), 1985, DNA Cloning: A
Practical Approach, IRL Press, Ltd., Oxford, U.K. Vol. I, II.) A preferred source is from
genomic DNA. Nucleic acid sequences derived from genomic DNA may contain regulatory
regions in addition to coding regions. Whatever the source, the isolated CP1, CP2 or CP3
gene should be molecularly cloned into a suitable vector for propagation of the gene.

In the molecular cloning of the gene from genomic DNA, DNA fragments are generated, 
some of which will encode the desired gene. The DNA may be cleaved at specific sites using 
various restriction enzymes. Alternatively, one may use DNAse in the presence of manganese 
to fragment the DNA, or the DNA can be physically sheared, as for example, by sonication. 
The linear DNA fragments can then be separated according to size by standard techniques, 
including but not limited to, agarose and polyacrylamide gel electrophoresis and column 
chromatography. 

Once the DNA fragments are generated, identification of the specific DNA fragment
containing the CP1, CP2 or CP3 may be accomplished in a number of ways. For example,
a B. subtilis CP1, CP2 or CP3 gene of the present invention or its specific RNA, or a
fragment thereof, such as a probe or primer, may be isolated and labeled and then used in
hybridization assays to detect a gram-positive CP1, CP2 or CP3 gene (Benton, W. and
Davis, R., 1977, Science 196:180; Grunstein, M. and Hogness, D., 1975, Proc. Natl. Acad.
Sci. USA 72:3961). Those DNA fragments sharing substantial sequence similarity to the
probe will hybridize under stringent conditions.

Accordingly, the present invention provides a method for the detection of gram- 
positive CP1, CP2 and CP3 polynucleotide homologs which comprises hybridizing part or all 
of a nucleic acid sequence of B. subtilis CP1, CP2 and CP3 with gram-positive 
microorganism nucleic acid of either genomic or cDNA origin. 

Also included within the scope of the present invention are gram-positive 
microorganism polynucleotide sequences that are capable of hybridizing to the nucleotide 
sequence of B. subtilis CP1, CP2 or CP3 under conditions of intermediate to maximal 
stringency. Hybridization conditions are based on the melting temperature (Tm) of the 
nucleic acid binding complex, as taught in Berger and Kimmel (1987, Guide to Molecular
Cloning Techniques, Methods in Enzymology, Vol. 152, Academic Press, San Diego CA)
incorporated herein by reference, and confer a defined "stringency" as explained below.

"Maximum stringency" typically occurs at about Tm-5°C (5°C below the Tm of the
probe); "high stringency" at about 5°C to 10°C below Tm; "intermediate stringency" at about
10°C to 20°C below Tm; and "low stringency" at about 20°C to 25°C below Tm. As will be
understood by those of skill in the art, a maximum stringency hybridization can be used to
identify or detect identical polynucleotide sequences while an intermediate or low stringency
hybridization can be used to identify or detect polynucleotide sequence homologs.
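
The stringency ranges above can be expressed as a simple lookup on the difference
between the probe Tm and the hybridization temperature. The sketch below is illustrative
only. The ranges in the text are approximate and share boundary values, so two assumptions
are made here: a temperature lying on a shared boundary is assigned to the more stringent
class, and any temperature within 5°C below Tm is treated as maximum stringency. The
function name and return labels are likewise hypothetical.

```python
def classify_stringency(tm_celsius: float, hybridization_celsius: float) -> str:
    """Classify hybridization stringency from degrees Celsius below Tm.

    Assumption: shared boundaries (5, 10, 20 degrees below Tm) fall into the
    more stringent class; the text gives only approximate ranges.
    """
    delta = tm_celsius - hybridization_celsius  # degrees below Tm
    if delta < 0:
        return "above Tm"
    if delta <= 5:
        return "maximum"
    if delta <= 10:
        return "high"
    if delta <= 20:
        return "intermediate"
    if delta <= 25:
        return "low"
    return "below low-stringency range"


if __name__ == "__main__":
    # Hypothetical probe with Tm = 70 degrees C, hybridized at 55 degrees C.
    print(classify_stringency(70.0, 55.0))
```

Under these assumptions, a screen for homologs rather than identical sequences would use
temperatures in the "intermediate" or "low" classes.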

The term "hybridization" as used herein shall include "the process by which a strand 
of nucleic acid joins with a complementary strand through base pairing" (Coombs J (1994) 
Dictionary of Biotechnology, Stockton Press, New York NY). 

The process of amplification as carried out in polymerase chain reaction (PCR) 
technologies is described in Dieffenbach CW and GS Dveksler (1995, PCR Primer, a 
Laboratory Manual, Cold Spring Harbor Press, Plainview NY). A nucleic acid sequence of 
at least about 10 nucleotides and as many as about 60 nucleotides from B. subtilis CP1,
CP2 or CP3, preferably about 12 to 30 nucleotides, and more preferably about 20-25
nucleotides, can be used as a probe or PCR primer.
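
As a hedged illustration of the length ranges just described, the following sketch enumerates
every candidate probe or primer of a chosen length from a nucleotide sequence. It only
applies the 10 to 60 nucleotide bound from the text; real probe design would also consider
Tm, GC content and specificity, which are outside this example. The function name and the
test sequence are hypothetical.

```python
def candidate_probes(sequence: str, length: int = 20) -> list[str]:
    """Return every contiguous window of `length` nucleotides in `sequence`.

    The length must lie in the roughly 10-60 nucleotide range given in the
    text; 20-25 is described there as more preferred.
    """
    if not 10 <= length <= 60:
        raise ValueError("probe length should be between 10 and 60 nucleotides")
    seq = sequence.upper()
    # A sequence shorter than `length` yields an empty list.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


if __name__ == "__main__":
    # Made-up 32-nucleotide sequence, not taken from the Figures.
    example = "ATGC" * 8
    probes = candidate_probes(example, 20)
    print(len(probes), probes[0])
```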

The B. subtilis amino acid sequences CP1, CP2 and CP3 (shown in Figures 2, 4 and
3, respectively) were identified via a FASTA search of Bacillus subtilis genomic nucleic acid
sequences. B. subtilis CP1 (YJDE) was identified by its structural homology to the cysteine
protease papain having the sequence designated "papa_carpa.p". As shown in Figure 2,
YJDE has the motif GXCWAF as well as the conserved catalytic residues His/Ala and
Asn/Ser. CP2 (YdhS) and CP3 (PMI) were identified based upon their structural homology
to CP1 (YJDE). The presence of GXCWAF as well as residues His/Ala and Asn/Ser is noted in
Figures 3 and 4. CP3 (PMI) was previously characterized as a possible phosphomannose
isomerase (Noramata). There has been no previous characterization of CP3 as a cysteine
protease.
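
A motif scan of the kind that flags candidate cysteine proteases can be sketched with a
regular expression. This is a minimal illustration and not the FASTA search described above:
it assumes that X in GXCWAF stands for any single amino acid, and the function name and
the short test sequence are made up for the example.

```python
import re

# Assumption: "X" in GXCWAF matches any single amino acid (one-letter code).
MOTIF = re.compile(r"G[A-Z]CWAF")


def find_motif(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched text) for each GXCWAF occurrence."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical fragment; not a sequence from the Figures.
    print(find_motif("MKTGSCWAFLLH"))
```

A hit from such a scan only nominates a sequence for further analysis; the text additionally
relies on the positions of the conserved His/Ala and Asn/Ser residues.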

II. Expression Systems 

The present invention provides host cells, expression methods and systems for the 
enhanced production and secretion of desired heterologous or homologous proteins in 
gram-positive microorganisms. In one embodiment, a host cell is genetically engineered to 
have a deletion or mutation in the gene encoding a gram-positive CP1, CP2 or CP3 such
that the respective activity is deleted. In another embodiment of the present invention, a 
gram-positive microorganism is genetically engineered to produce a cysteine protease of 
the present invention. 

Inactivation of a gram-positive cysteine protease in a host cell 
Producing an expression host cell incapable of producing the naturally occurring 
cysteine protease necessitates the replacement and/or inactivation of the naturally occurring 
gene from the genome of the host cell. In a preferred embodiment, the mutation is a non- 
reverting mutation. 

One method for mutating nucleic acid encoding a gram-positive cysteine protease is 
to clone the nucleic acid or part thereof, modify the nucleic acid by site directed 



PCT/US98/14529 



mutagenesis and reintroduce the mutated nucleic acid into the cell on a plasmid. By
homologous recombination, the mutated gene may be introduced into the chromosome. In
the parent host cell, the result is that the naturally occurring nucleic acid and the mutated
nucleic acid are located in tandem on the chromosome. After a second recombination, the
modified sequence is left in the chromosome, having thereby effectively introduced the
mutation into the chromosomal gene for progeny of the parent host cell.

Another method for inactivating the cysteine protease proteolytic activity is through
deleting the chromosomal gene copy. In a preferred embodiment, the entire gene is
deleted, the deletion occurring in such a way as to make reversion impossible. In another
preferred embodiment, a partial deletion is produced, provided that the nucleic acid
sequence left in the chromosome is too short for homologous recombination with a plasmid
encoded cysteine protease gene. In another preferred embodiment, nucleic acid encoding
the catalytic amino acid residues is deleted.

Deletion of the naturally occurring gram-positive microorganism cysteine protease
can be carried out as follows. A cysteine protease gene including its 5' and 3' regions is
isolated and inserted into a cloning vector. The coding region of the cysteine protease gene
is deleted from the vector in vitro, leaving behind a sufficient amount of the 5' and 3'
flanking sequences to provide for homologous recombination with the naturally occurring
gene in the parent host cell. The vector is then transformed into the gram-positive host cell.
The vector integrates into the chromosome via homologous recombination in the flanking
regions. This method leads to a gram-positive strain in which the protease gene has been
deleted.
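
The in vitro deletion step above can be pictured as a simple sequence operation: the coding
region is removed and the 5' and 3' flanks are joined. The sketch below is purely illustrative.
The text does not state how much flanking sequence is "sufficient", so the minimum flank
length is a hypothetical parameter, as are the function name and the toy sequence.

```python
def build_deletion_cassette(locus: str, cds_start: int, cds_end: int,
                            min_flank: int = 3) -> str:
    """Join the 5' and 3' flanks of a locus after removing the coding region.

    `cds_start` and `cds_end` are 0-based, end-exclusive coordinates of the
    coding region within `locus`. `min_flank` is a placeholder for whatever
    flank length is sufficient for homologous recombination in practice.
    """
    if not 0 <= cds_start < cds_end <= len(locus):
        raise ValueError("coding region coordinates are out of range")
    five_prime = locus[:cds_start]
    three_prime = locus[cds_end:]
    if len(five_prime) < min_flank or len(three_prime) < min_flank:
        raise ValueError("flanking sequence too short for recombination")
    return five_prime + three_prime


if __name__ == "__main__":
    # Toy locus: 5 nt of 5' flank, 10 nt "coding region", 5 nt of 3' flank.
    toy_locus = "AAAAA" + "CCCCCGGGGG" + "TTTTT"
    print(build_deletion_cassette(toy_locus, 5, 15))
```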

The vector used in an integration method is preferably a plasmid. A selectable
marker may be included to allow for ease of identification of desired recombinant
microorganisms. Additionally, as will be appreciated by one of skill in the art, the vector is
preferably one which can be selectively integrated into the chromosome. This can be
achieved by introducing an inducible origin of replication, for example, a temperature
sensitive origin, into the plasmid. By growing the transformants at a temperature to which
the origin of replication is sensitive, the replication function of the plasmid is inactivated,
thereby providing a means for selection of chromosomal integrants. Integrants may be
selected for growth at high temperatures in the presence of the selectable marker, such as
an antibiotic. Integration mechanisms are described in WO 88/06623.

Integration by the Campbell-type mechanism can take place in the 5' flanking region
of the protease gene, resulting in a protease positive strain carrying the entire plasmid
vector in the chromosome in the cysteine protease locus. Since illegitimate recombination
will give different results, it will be necessary to determine whether the complete gene has
been deleted, such as through nucleic acid sequencing or restriction maps.

Another method of inactivating the naturally occurring cysteine protease gene is to
mutagenize the chromosomal gene copy by transforming a gram-positive microorganism



WO 99/04016 






with oligonucleotides which are mutagenic. Alternatively, the chromosomal cysteine
protease gene can be replaced with a mutant gene by homologous recombination. 

The present invention encompasses host cells having deletions or mutations of a
cysteine protease of the present invention as well as additional protease deletions or
mutations, such as deletions or mutations in apr, npr, epr, mpr and others known to those of
skill in the art. United States Patent 5,264,366 discloses Bacillus host cells having a
deletion of apr and npr; United States Patent 5,585,253 discloses Bacillus host cells having
a deletion of epr; Margot et al., 1996, Microbiology 142:3437-3444 disclose host cells
having a deletion in wpr; and EP patent 0369817 discloses Bacillus host cells having a
deletion of mpr.

One assay for the detection of mutants involves growing the Bacillus host cell on
medium containing a protease substrate and measuring the appearance, or lack thereof, of
a zone of clearing or halo around the colonies. Host cells which have an inactive protease
will exhibit little or no halo around the colonies.

III. Production of Cysteine Protease

For production of cysteine protease in a host cell, an expression vector comprising at
least one copy of nucleic acid encoding a gram-positive microorganism CP1, CP2 or CP3,
and preferably comprising multiple copies, is transformed into the host cell under conditions
suitable for expression of the cysteine protease. In accordance with the present invention,
polynucleotides which encode a gram-positive microorganism CP1, CP2 or CP3, or
fragments thereof, or fusion proteins or polynucleotide homolog sequences that encode
amino acid variants of B. subtilis CP1, CP2 or CP3, may be used to generate recombinant
DNA molecules that direct their expression in host cells. In a preferred embodiment, the
gram-positive host cell belongs to the genus Bacillus. In another preferred embodiment, the
gram-positive host cell is B. subtilis.

As will be understood by those of skill in the art, it may be advantageous to produce
polynucleotide sequences possessing non-naturally occurring codons. Codons preferred by
a particular gram-positive host cell (Murray E et al (1989) Nuc Acids Res 17:477-508) can
be selected, for example, to increase the rate of expression or to produce recombinant RNA
transcripts having desirable properties, such as a longer half-life, than transcripts produced
from naturally occurring sequence.

Altered CP1, CP2 or CP3 polynucleotide sequences which may be used in
accordance with the invention include deletions, insertions or substitutions of different
nucleotide residues resulting in a polynucleotide that encodes the same or a functionally
equivalent CP1, CP2 or CP3 homolog, respectively. As used herein a "deletion" is defined
as a change in either nucleotide or amino acid sequence in which one or more nucleotides
or amino acid residues, respectively, are absent.

As used herein an "insertion" or "addition" is that change in a nucleotide or amino
acid sequence which has resulted in the addition of one or more nucleotides or amino acid
residues, respectively, as compared to the naturally occurring CP1, CP2 or CP3.

As used herein "substitution" results from the replacement of one or more 
nucleotides or amino acids by different nucleotides or amino acids, respectively. 

The encoded protein may also show deletions, insertions or substitutions of amino
acid residues which produce a silent change and result in a functionally equivalent CP1,
CP2 or CP3 variant. Deliberate amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the
amphipathic nature of the residues as long as the variant retains the ability to modulate
secretion. For example, negatively charged amino acids include aspartic acid and glutamic
acid; positively charged amino acids include lysine and arginine; and amino acids with
uncharged polar head groups having similar hydrophilicity values include leucine,
isoleucine, valine; glycine, alanine; asparagine, glutamine; serine, threonine,
phenylalanine, and tyrosine.

The CP1, CP2 or CP3 polynucleotides of the present invention may be engineered 
in order to modify the cloning, processing and/or expression of the gene product. For 
example, mutations may be introduced using techniques which are well known in the art, 
e.g., site-directed mutagenesis to insert new restriction sites, to alter glycosylation patterns or
to change codon preference, for example. 

In one embodiment of the present invention, a gram-positive microorganism CP1, 
CP2 or CP3 polynucleotide may be ligated to a heterologous sequence to encode a fusion
protein. A fusion protein may also be engineered to contain a cleavage site located 
between the cysteine protease nucleotide sequence and the heterologous protein 
sequence, so that the cysteine protease may be cleaved and purified away from the 
heterologous moiety. 

IV. Vector Sequences

Expression vectors used in expressing the cysteine proteases of the present
invention in gram-positive microorganisms comprise at least one promoter associated with a 
cysteine protease selected from the group consisting of CP1, CP2 and CP3, which promoter 
is functional in the host cell. In one embodiment of the present invention, the promoter is 
the wild-type promoter for the selected cysteine protease and in another embodiment of the 
present invention, the promoter is heterologous to the cysteine protease, but still functional 
in the host cell. In one preferred embodiment of the present invention, nucleic acid 
encoding the cysteine protease is stably integrated into the microorganism genome. 

In a preferred embodiment, the expression vector contains a multiple cloning site
cassette which preferably comprises at least one restriction endonuclease site unique to the
vector, to facilitate ease of nucleic acid manipulation. In a preferred embodiment, the vector
also comprises one or more selectable markers. As used herein, the term selectable
marker refers to a gene capable of expression in the gram-positive host which allows for
ease of selection of those hosts containing the vector. Examples of such selectable
markers include but are not limited to antibiotics, such as erythromycin, actinomycin,
chloramphenicol and tetracycline.

V. Transformation 

A variety of host cells can be used for the production of CP1, CP2 and CP3,
including bacterial, fungal, mammalian and insect cells. General
transformation procedures are taught in Current Protocols In Molecular Biology (vol. 1,
edited by Ausubel et al., John Wiley & Sons, Inc. 1987, Chapter 9) and include calcium
phosphate methods, transformation using DEAE-Dextran and electroporation. Plant
transformation methods are taught in Rodriquez (WO 95/14099, published 26 May 1995).

In a preferred embodiment, the host cell is a gram-positive microorganism and in 
another preferred embodiment, the host cell is Bacillus. In one embodiment of the present 
invention, nucleic acid encoding one or more cysteine protease(s) of the present invention is 
introduced into a host cell via an expression vector capable of replicating within the Bacillus 
host cell. Suitable replicating plasmids for Bacillus are described in Molecular Biological 
Methods for Bacillus, Ed. Harwood and Cutting, John Wiley & Sons, 1990, hereby expressly 
incorporated by reference; see chapter 3 on plasmids. Suitable replicating plasmids for B. 
subtilis are listed on page 92. 

In another embodiment, nucleic acid encoding a cysteine protease(s) of the present
invention is stably integrated into the microorganism genome. Preferred host cells are
gram-positive host cells. Another preferred host is Bacillus. Another preferred host is
Bacillus subtilis. Several strategies have been described in the literature for the direct
cloning of DNA in Bacillus. Plasmid marker rescue transformation involves the uptake of a
donor plasmid by competent cells carrying a partially homologous resident plasmid
(Contente et al., Plasmid 2:555-571 (1979); Haima et al., Mol. Gen. Genet. 223:185-191
(1990); Weinrauch et al., J. Bacteriol. 154(3):1077-1087 (1983); and Weinrauch et al., J.
Bacteriol. 169(3):1205-1211 (1987)). The incoming donor plasmid recombines with the
homologous region of the resident "helper plasmid" in a process that mimics chromosomal
transformation.

Protoplast transformation is described for B. subtilis in Chang and 
Cohen, (1979) Mol. Gen. Genet. 168:111-115; for B. megaterium in Vorobjeva et al., (1980) 
FEMS Microbiol. Letters 7:261-263; for B. amyloliquefaciens in Smith et al., (1986) Appl. 
and Env. Microbiol. 51:634; for B. thuringiensis in Fisher et al., (1981) Arch. Microbiol. 
139:213-217; for B. sphaericus in McDonald (1984) J. Gen. Microbiol. 130:203; and for B. larvae 
in Bakhiet et al., (1985) 49:577. Mann et al. (1986, Current Microbiol. 13:131-135) report 
on transformation of Bacillus protoplasts, and Holubova ((1985) Folia Microbiol. 30:97) 
discloses methods for introducing DNA into protoplasts using DNA-containing liposomes. 

VI. Identification of Transformants 

Whether a host cell has been transformed with a mutated or a naturally occurring 
gene encoding a gram-positive CP1, CP2 or CP3, detection of the presence/absence of 
marker gene expression can suggest whether the gene of interest is present. However, its 
expression should be confirmed. For example, if the nucleic acid encoding a cysteine 
protease is inserted within a marker gene sequence, recombinant cells containing the insert 
can be identified by the absence of marker gene function. Alternatively, a marker gene can 
be placed in tandem with nucleic acid encoding the cysteine protease under the control of a 
single promoter. Expression of the marker gene in response to induction or selection 
usually indicates expression of the cysteine protease as well. 

Alternatively, host cells which contain the coding sequence for a cysteine protease 
and express the protein may be identified by a variety of procedures known to those of skill 
in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA 
hybridization and protein bioassay or immunoassay techniques which include 
membrane-based, solution-based, or chip-based technologies for the detection and/or quantification of 
the nucleic acid or protein. 

The presence of the cysteine protease polynucleotide sequence can be detected by 
DNA-DNA or DNA-RNA hybridization or amplification using probes, portions or fragments of 
B. subtilis CP1, CP2 or CP3. 

VII. Assay of Protease Activity 

There are various assays known to those of skill in the art for detecting and 
measuring protease activity. There are assays based upon the release of acid-soluble 
peptides from casein or hemoglobin measured as absorbance at 280 nm or colorimetrically 
using the Folin method (Bergmeyer, et al., 1984, Methods of Enzymatic Analysis vol. 5, 
Peptidases, Proteinases and their Inhibitors, Verlag Chemie, Weinheim). Other assays 
involve the solubilization of chromogenic substrates (Ward, 1983, Proteinases, in Microbial 
Enzymes and Biotechnology (W.M. Fogarty, ed.), Applied Science, London, pp. 251-317). 

VIII. Secretion of Recombinant Proteins 

Means for determining the levels of secretion of a heterologous or homologous 
protein in a gram-positive host cell and detecting secreted proteins include using either 
polyclonal or monoclonal antibodies specific for the protein. Examples include 
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and fluorescence-activated 
cell sorting (FACS). These and other assays are described, among other places, in 
Hampton R et al (1990, Serological Methods, a Laboratory Manual, APS Press, St Paul MN) 
and Maddox DE et al (1983, J Exp Med 158:1211). 

A wide variety of labels and conjugation techniques are known by those skilled in the 
art and can be used in various nucleic and amino acid assays. Means for producing labeled 
hybridization or PCR probes for detecting specific polynucleotide sequences include 
oligolabeling, nick translation, end-labeling or PCR amplification using a labeled nucleotide. 
Alternatively, the nucleotide sequence, or any portion of it, may be cloned into a vector for 
the production of an mRNA probe. Such vectors are known in the art, are commercially 
available, and may be used to synthesize RNA probes in vitro by addition of an appropriate 
RNA polymerase such as T7, T3 or SP6 and labeled nucleotides. 

A number of companies such as Pharmacia Biotech (Piscataway NJ), Promega 
(Madison WI), and US Biochemical Corp (Cleveland OH) supply commercial kits and 
protocols for these procedures. Suitable reporter molecules or labels include those 
radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as 
substrates, cofactors, inhibitors, magnetic particles and the like. Patents teaching the use 
of such labels include US Patents 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 
4,275,149 and 4,366,241. Also, recombinant immunoglobulins may be produced as shown 
in US Patent No. 4,816,567. 

IX. Purification of Proteins 

Gram-positive host cells transformed with polynucleotide sequences encoding 
heterologous or homologous protein may be cultured under conditions suitable for the 
expression and recovery of the encoded protein from cell culture. The protein produced by 
a recombinant gram-positive host cell comprising a mutation or deletion of the cysteine 
protease activity will be secreted into the culture media. Other recombinant constructions 
may join the heterologous or homologous polynucleotide sequences to a nucleotide sequence 
encoding a polypeptide domain which will facilitate purification of soluble proteins (Kroll DJ 
et al (1993) DNA Cell Biol 12:441-53). 

Such purification facilitating domains include, but are not limited to, metal chelating 
peptides such as histidine-tryptophan modules that allow purification on immobilized metals 
(Porath J (1992) Protein Expr Purif 3:263-281), protein A domains that allow purification on 
immobilized immunoglobulin, and the domain utilized in the FLAGS extension/affinity 
purification system (Immunex Corp, Seattle WA). The inclusion of a cleavable linker 
sequence such as Factor XA or enterokinase (Invitrogen, San Diego CA) between the 
purification domain and the heterologous protein can be used to facilitate purification. 

X. Uses of The Present Invention 

CP1, CP2 and CP3 and Genetically Engineered Host Cells 

The present invention provides genetically engineered host cells comprising 
preferably non-revertable mutations or deletions in the naturally occurring gene encoding 
CP1, CP2 or CP3 such that the proteolytic activity is diminished or deleted altogether. The 
host cell may contain additional protease deletions, such as deletions of the mature subtilisin 
protease and/or mature neutral protease disclosed in United States Patent No. 5,264,366. 

In a preferred embodiment, the host cell is further genetically engineered to produce 
a desired protein or polypeptide. In a preferred embodiment the host cell is a Bacillus. In 
another preferred embodiment, the host cell is a Bacillus subtilis. 

In an alternative embodiment, a host cell is genetically engineered to produce a 
gram-positive CP1, CP2 or CP3. In a preferred embodiment, the host cell is grown under 
large scale fermentation conditions, the CP1, CP2 or CP3 is isolated and/or purified and 
used in cleaning compositions such as detergents. Detergent formulations are disclosed in 
WO 95/10615. A cysteine protease of the present invention can be useful in formulating 
various cleaning compositions. A number of known compounds are suitable surfactants 
useful in compositions comprising the cysteine protease of the invention. These include 
nonionic, anionic, cationic or zwitterionic detergents, as disclosed in US 4,404,128 
and US 4,261,868. A suitable detergent formulation is that described in Example 7 of US 
Patent 5,204,015. The art is familiar with the different formulations which can be used as 
cleaning compositions. In addition, a cysteine protease of the present invention can be 
used, for example, in bar or liquid soap applications, dishcare formulations, contact lens 
cleaning solutions or products, peptide hydrolysis, waste treatment, textile applications, as 
fusion-cleavage enzymes in protein production, etc. A cysteine protease may provide 
enhanced performance in a detergent composition (as compared to another detergent 
protease). As used herein, enhanced performance in a detergent is defined as increasing 
cleaning of certain enzyme sensitive stains such as grass or blood, as determined by usual 
evaluation after a standard wash cycle. 

A cysteine protease of the present invention can be formulated into known 
powdered and liquid detergents having pH between 6.5 and 12.0 at levels of about 0.01 to 
about 5% (preferably 0.1% to 5%) by weight. These detergent cleaning compositions can 
also include other enzymes such as known proteases, amylases, cellulases, lipases or 
endoglycosidases, as well as builders and stabilizers. 

The addition of a cysteine protease to conventional cleaning compositions does not 
create any special use limitation. In other words, any temperature and pH suitable for the 
detergent is also suitable for the present compositions as long as the pH is within the above 
range, and the temperature is below the described cysteine protease denaturing 
temperature. In addition, a cysteine protease can be used in a cleaning composition without 
detergents, again either alone or in combination with builders and stabilizers. 

One aspect of the invention is a composition for the treatment of a textile that 
includes a cysteine protease of the present invention. The composition can be used to treat, 
for example, silk or wool as described in publications such as RD 216,034; EP 134,267; US 
4,533,359; and EP 344,259. 

Proteases can be included in animal feed such as part of animal feed additives as 
described in, for example, US 5,612,055; US 5,314,692; and US 5,147,642. 

CP1, CP2 and CP3 Polynucleotides 

A B. subtilis polynucleotide, or any part thereof, provides the basis for detecting the 
presence of gram-positive microorganism polynucleotide homologs through hybridization 
techniques and PCR technology. 

Accordingly, one aspect of the present invention is to provide for nucleic acid 
hybridization and PCR probes which can be used to detect polynucleotide sequences, 
including genomic and cDNA sequences, encoding gram-positive CP1, CP2 or CP3 or 
portions thereof. 

The manner and method of carrying out the present invention may be more fully 
understood by those of skill in the art by reference to the following examples, which 
examples are not intended in any manner to limit the scope of the present invention or of 
the claims directed thereto. 

Example I 
Preparation of a Genomic Library 
The following example illustrates the preparation of a Bacillus genomic library. 
Genomic DNA from Bacillus cells is prepared as taught in Current Protocols In 
Molecular Biology vol. 1, edited by Ausubel et al., John Wiley & Sons, Inc. 1987, chapter 
2.4.1. Generally, Bacillus cells from a saturated liquid culture are lysed and the proteins 
removed by digestion with proteinase K. Cell wall debris, polysaccharides, and remaining 
proteins are removed by selective precipitation with CTAB, and high molecular weight 
genomic DNA is recovered from the resulting supernatant by isopropanol precipitation. If 
exceptionally clean genomic DNA is desired, an additional step of purifying the Bacillus 
genomic DNA on a cesium chloride gradient is added. 

After obtaining purified genomic DNA, the DNA is subjected to Sau3A digestion. 
Sau3A recognizes the 4 base pair site GATC and generates fragments compatible with 
several convenient phage lambda and cosmid vectors. The DNA is subjected to partial 
digestion to increase the chance of obtaining random fragments. 
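
As a rough illustration of the digestion step, the following Python sketch locates GATC sites in a sequence and lists the fragments that a complete digest would give. The function names and the toy sequence are hypothetical and are not part of the procedure described here. The sketch assumes that Sau3A cuts immediately 5' of the G in GATC and models the top strand only; a partial digest, as used for library construction, would cut only a subset of these sites.

```python
def find_sites(seq, site="GATC"):
    """Return the 0-based start position of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def complete_digest(seq, site="GATC"):
    """Cut the top strand immediately 5' of each site and return the fragments.

    Assumption: every site is cut (complete digest). Empty fragments, which
    would arise if the sequence begins with the site, are dropped.
    """
    seq = seq.upper()
    bounds = [0] + find_sites(seq, site) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy sequence, used only to exercise the functions.
toy = "AAGATCTTTTGATCCCGATC"
sites = find_sites(toy)
fragments = complete_digest(toy)
print(sites)
print(fragments)
```

For a random sequence with equal base frequencies, a 4 base pair site is expected on average once every 4^4 = 256 base pairs, which is why a complete digest gives fragments too small for most genomic libraries and a partial digest is used instead.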

The partially digested Bacillus genomic DNA is subjected to size fractionation on a 
1% agarose gel prior to cloning into a vector. Alternatively, size fractionation on a sucrose 
gradient can be used. The genomic DNA obtained from the size fractionation step is 
purified away from the agarose and ligated into a cloning vector appropriate for use in a 
host cell and transformed into the host cell. 

Example II 

Detection of Gram-Positive Microorganisms 

The following example describes the detection of gram-positive microorganism CP1. 
The same procedures can be used to detect CP2 and CP3. 

DNA derived from a gram-positive microorganism is prepared according to the 
methods disclosed in Current Protocols in Molecular Biology, Chap. 2 or 3. The nucleic acid 
is subjected to hybridization and/or PCR amplification with a probe or primer derived from 
CP1. A preferred probe comprises the nucleic acid section containing the conserved motif 
GXCWAF. 
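
The conserved motif can also be searched for in silico once candidate sequences have been translated. The Python sketch below is an illustration only and is not part of the procedure described here: it assumes that X stands for any one of the 20 standard amino acids, and the protein sequence shown is a made-up toy sequence, not a CP1 sequence.

```python
import re

# Assumption: X in GXCWAF stands for any one of the 20 standard amino acids.
MOTIF = re.compile(r"G[ACDEFGHIKLMNPQRSTVWY]CWAF")


def find_motif(protein):
    """Return (0-based position, matched text) for each motif occurrence."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(protein.upper())]


# Hypothetical toy sequence, not a real cysteine protease.
toy_protein = "MKTLLGSCWAFSAVAAL"
hits = find_motif(toy_protein)
print(hits)
```

A hit of this kind only flags a candidate; the hybridization and sequencing steps described in this example remain the means of confirmation.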

The nucleic acid probe is labeled by combining 50 pmol of the nucleic acid and 250 
mCi of [gamma-32P] adenosine triphosphate (Amersham, Chicago IL) and T4 
polynucleotide kinase (DuPont NEN®, Boston MA). The labeled probe is purified with a 
Sephadex G-25 superfine resin column (Pharmacia). A portion containing 10^7 counts per 
minute is used in a typical membrane-based hybridization analysis of a nucleic acid 
sample of either genomic or cDNA origin. 

The DNA sample which has been subjected to restriction endonuclease digestion is 
fractionated on a 0.7 percent agarose gel and transferred to nylon membranes (Nytran Plus, 
Schleicher & Schuell, Durham NH). Hybridization is carried out for 16 hours at 40 degrees 
C. To remove nonspecific signals, blots are sequentially washed at room temperature 
under increasingly stringent conditions up to 0.1 x saline sodium citrate and 0.5% sodium 
dodecyl sulfate. The blots are exposed to film for several hours, the film developed and 
hybridization patterns are compared visually to detect polynucleotide homologs of B. subtilis 
CP1. The homologs are subjected to confirmatory nucleic acid sequencing. Methods for 
nucleic acid sequencing are well known in the art. Conventional enzymatic methods employ 
DNA polymerase Klenow fragment, SEQUENASE® (US Biochemical Corp, Cleveland, OH) 
or Taq polymerase to extend DNA chains from an oligonucleotide primer annealed to the 
DNA template of interest. 
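
The most stringent wash named in this example (0.1 x saline sodium citrate, 0.5% sodium dodecyl sulfate) is normally made by diluting concentrated stocks. The Python sketch below works out the volumes with the relation C1 x V1 = C2 x V2. The stock strengths (20 x saline sodium citrate and 10% sodium dodecyl sulfate), the one litre batch size and the function name are assumptions chosen for illustration; none of them is specified in this example.

```python
def dilution_volumes(final_volume_ml, components):
    """Return the volume (ml) of each stock, plus water, for a diluted buffer.

    `components` maps a name to (stock_concentration, final_concentration),
    both in the same units. Uses C1 * V1 = C2 * V2 for each component.
    """
    volumes = {}
    for name, (stock, final) in components.items():
        if final > stock:
            raise ValueError(f"{name}: final concentration exceeds stock")
        volumes[name] = final_volume_ml * final / stock
    # Water makes up the remainder of the final volume.
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# Assumed stocks: 20 x SSC and 10% SDS; assumed batch size: 1000 ml.
wash = dilution_volumes(1000, {"SSC": (20.0, 0.1), "SDS": (10.0, 0.5)})
for name, ml in wash.items():
    print(f"{name}: {ml:.1f} ml")
```

The same function can be reused for the earlier, less stringent washes by passing higher final saline sodium citrate concentrations.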

Various other examples and modifications of the foregoing description and examples 
will be apparent to a person skilled in the art after reading the disclosure without departing 
from the spirit and scope of the invention, and it is intended that all such examples or 
modifications be included within the scope of the appended claims. All publications and 
patents referenced herein are hereby incorporated by reference in their entirety. 



